An Empirical Framework for Automatically Selecting the Best Bayesian Classifier
نویسندگان
چکیده
Data miners have access to a significant number of classifiers and use them on a variety of different types of dataset. This large selection makes it difficult to know which classifier will perform most effectively in any given case. Usually an understanding of learning algorithms is combined with detailed domain knowledge of the dataset at hand to lead to the choice of a classifier. We propose an empirical framework that quantitatively assesses the accuracy of a selection of classifiers on different datasets, resulting in a set of classification rules generated by the J48 decision tree algorithm. Data miners can follow these rules to select the most effective classifier for their work. By optimising the parameters used for learning and the sampling techniques applied, a set of rules were learned that select with 78% accuracy, the most effective classifier.
منابع مشابه
Parametric Empirical Bayes Test and Its Application to Selection of Wavelet Threshold
In this article, we propose a new method for selecting level dependent threshold in wavelet shrinkage using the empirical Bayes framework. We employ both Bayesian and frequentist testing hypothesis instead of point estimation method. The best test yields the best prior and hence the more appropriate wavelet thresholds. The standard model functions are used to illustrate the performance of the p...
متن کاملUsing Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text Classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents in predefined categories based on documents’ contents and labeled-training samples. Since word detection is a difficult and time consuming task in Persian language, Bayesian text classifier is an appropriate approach to deal with different...
متن کاملUsing Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text Classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents in predefined categories based on documents’ contents and labeled-training samples. Since word detection is a difficult and time consuming task in Persian language, Bayesian text classifier is an appropriate approach to deal with different...
متن کاملA New Hybrid Framework for Filter based Feature Selection using Information Gain and Symmetric Uncertainty (TECHNICAL NOTE)
Feature selection is a pre-processing technique used for eliminating the irrelevant and redundant features which results in enhancing the performance of the classifiers. When a dataset contains more irrelevant and redundant features, it fails to increase the accuracy and also reduces the performance of the classifiers. To avoid them, this paper presents a new hybrid feature selection method usi...
متن کاملارتقای کیفیت دستهبندی متون با استفاده از کمیته دستهبند دو سطحی
Nowadays, the automated text classification has witnessed special importance due to the increasing availability of documents in digital form and ensuing need to organize them. Although this problem is in the Information Retrieval (IR) field, the dominant approach is based on machine learning techniques. Approaches based on classifier committees have shown a better performance than the others. I...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2009